What It Is
Speech recognition, or voice recognition, is a means of input for computer users in which the computer is able to receive and interpret spoken language. Using a combination of voice recognition software and computer hardware, an individual can operate a computer and work in many software applications by talking to it. Voice recognition software has three functions: commands and controls, dictation of text and data, and correction and editing. A computer user can navigate the desktop and various computer applications; dictate text into a word processing program to create, edit, format, save, and print documents; dictate data into spreadsheets; explore the Internet; and use a portable digital recorder for later transfer of data to a computer. In essence, voice recognition has the potential to enable an individual to operate a computer hands-free, by talking rather than by keyboarding or pointing and clicking with a mouse.
How It Works
To use voice recognition as a means of computer input, both software and hardware components are required. These include voice recognition software, an adequate microphone, a computer that meets or exceeds the minimum requirements of the software, a sound card, and speakers or earphones for text-to-speech or screen reading output. In addition, the user must have the physical and cognitive skills needed to operate voice recognition software and any other software used in conjunction with it.
For the software to transcribe a user’s voice accurately, the user must commit time and be willing to learn to use it. Enrollment is the first step in using voice recognition: a process that trains the computer to recognize an individual’s voice and speech patterns. The speaker is prompted to read several paragraphs of text. As the user talks, the software creates voice files for that specific user. The initial enrollment may take as little as ten minutes, but usually takes 15 to 20 minutes. Once the initial enrollment has occurred, additional training should be conducted. Proper use of the program following the initial enrollment is essential to the success of voice recognition. “The first few hours of dictation should be carefully monitored so that errors in recognition are corrected through the software and the student understands where the software is having difficulty understanding his or her voice.” (Follansbee, 2003) As the speaker begins using the software, he or she must learn to speak in a manner that the computer can understand. The speaker must also learn the commands and the proper procedures for correcting misrecognized words. Correcting an error means replacing a misrecognized word with the desired word, and voice recognition programs provide procedures for making these corrections.
As the user dictates and makes corrections, the voice recognition software continues to refine the voice files. To develop high recognition accuracy, recognition errors should be corrected as they occur. There are also procedures for making revisions, that is, editing text that was recognized accurately. With proper training and correction of misrecognized words, high levels of accuracy can be attained. Accuracy rates for experienced users with properly trained speech files and an adequate computer system are reported to be as high as 97%.
With experience, one should be able to dictate at least 90 words a minute on a fast PC, (which is as fast as a professional typist could achieve), and get around 97% accuracy of word recognition (i.e. the computer gets the word right the first time 97 times out of 100). (Dyslexic.com, 2002)
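To make the figures quoted above concrete, the short Python sketch below works through the arithmetic. It is only an illustration: the function names are made up for this example, and it assumes that recognition errors are spread evenly across the dictated words, which real dictation will not follow exactly.

```python
def word_accuracy(words_dictated: int, words_misrecognized: int) -> float:
    """Fraction of dictated words the software got right the first time."""
    if words_dictated <= 0:
        raise ValueError("words_dictated must be positive")
    return (words_dictated - words_misrecognized) / words_dictated


def expected_corrections_per_minute(words_per_minute: float, accuracy: float) -> float:
    """Average number of misrecognized words per minute of dictation.

    Assumes errors are spread evenly over the dictated words.
    """
    return words_per_minute * (1.0 - accuracy)


if __name__ == "__main__":
    # 97 words out of 100 recognized correctly the first time.
    print(f"Accuracy: {word_accuracy(100, 3):.0%}")
    # Dictating 90 words a minute at 97% accuracy.
    rate = expected_corrections_per_minute(90, 0.97)
    print(f"Words needing correction per minute: {rate:.1f}")
```

Under these assumptions, a user dictating 90 words a minute at 97% accuracy would need to correct roughly two or three words each minute, which is why correcting errors as they occur remains part of normal use.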
The individual using voice recognition can use it to control the computer and dictate text and data into various applications. Commands and controls allow for navigation of windows. The vocabulary for command and control is a specific set of terms used for control and navigation throughout the computer environment. Some voice recognition programs allow the user to create or download macros for additional control/command options. Dictation allows the user to enter text or data. Voice recognition can be used for creating text documents and can be viewed as writing by talking.
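The difference between command and control and dictation can be sketched in a few lines of Python. This is a simplified illustration, not how any particular product works: the command phrases, the function names, and the idea of treating an utterance as already-recognized text are all assumptions made for the example.

```python
from typing import Callable, Dict, List


def build_command_table() -> Dict[str, Callable[[], str]]:
    """A fixed command vocabulary mapped to actions (hypothetical phrases)."""
    return {
        "open file": lambda: "opening a file",
        "close window": lambda: "closing the window",
        "save document": lambda: "saving the document",
    }


def handle_utterance(
    utterance: str,
    commands: Dict[str, Callable[[], str]],
    document: List[str],
) -> str:
    """Run the utterance as a command if it matches the vocabulary,
    otherwise treat it as dictated text and add it to the document."""
    phrase = utterance.strip().lower()
    if phrase in commands:
        return commands[phrase]()
    document.append(utterance.strip())
    return "dictated"


if __name__ == "__main__":
    commands = build_command_table()
    # A user-defined macro is simply one more entry in the vocabulary.
    commands["insert signature"] = lambda: "inserting the signature block"

    document: List[str] = []
    for spoken in ["Dear Ms. Lee", "Insert Signature", "Save Document"]:
        print(spoken, "->", handle_utterance(spoken, commands, document))
    print(document)
```

Because the command vocabulary is a small, fixed set of terms, anything that does not match it can be treated as dictation. Adding a macro extends that set without changing how dictation is handled.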
There are two types of voice recognition input: discrete and continuous. Discrete speech recognition requires the speaker to pause briefly between words, and the computer attempts to recognize each word in isolation. Continuous voice recognition recognizes speech best when text is spoken in sentences and phrases. Most current versions of voice recognition software use the continuous speech method. The exception is Dragon Dictate, where the user must leave a short gap between each word. Although Dragon Dictate is an old program and its ‘word-by-word’, ‘discrete’ speech recognition may seem a less ‘natural’ technique, it has some advantages for some groups of students. (Nisbet, P. 2003)
Some voice recognition programs have voice output including playback and text-to-speech features. Playback is the ability to play back the text that the speaker has dictated in the speaker’s own voice. Text-to-speech is the ability to read back the text as it was recognized, in a computer voice.
Those using voice recognition may have greater success using additional means of input. With voice recognition systems, the additional use of the keyboard and the mouse with dictation dramatically increases the efficiency of the program (The Washington Assistive Technology Alliance, 1999). Some speech recognition products perform tasks within the areas of commands and controls, dictation of text and data, and correction and editing better than others, and the extent to which users may utilize functions and features within these areas exclusively by voice varies from one speech recognition product to another. All will require some keyboard or mouse input to accomplish certain tasks - none are 100% hands-free. (An Introduction to Continuous Speech Recognition Software)
Products under revision
Original author unknown. Created by a participant in the Advanced Technology Class at Pennsylvania College of Optometry